Drift-free video encoding and decoding method, and corresponding devices

ABSTRACT

Three-dimensional (3D) subband coding schemes use motion compensation in their temporal filtering stage. Unfortunately, this procedure introduces two drawbacks: (a) the MC being applied at the full resolution, a drift appears when decoding at a lower resolution, and (b) all the motion vectors estimated at full resolution are transmitted, which is a waste of bits. According to the invention, a low resolution sequence is first obtained by generating from the original input sequence of frames—by means of a wavelet decomposition—a sequence of low resolution frames and performing on them a motion compensated spatio-temporal analysis. Then, a motion compensated spatio-temporal analysis of each full resolution group of frames is performed, and the low frequency subbands of the decomposition are finally replaced, at each temporal decomposition level, by the corresponding spatio-temporal subbands of the generated low resolution sequence. The modified sequence thus obtained is finally coded. Thanks to this approach, a good behavior at low resolution is maintained (no more drift) while getting closer to the performance of a classic 3D subband codec at full resolution.

The present invention relates to an encoding method for the compression of an original video sequence divided into successive groups of frames (GOFs) and to a corresponding decoding method. It also relates to corresponding encoding and decoding devices.

The growth of the Internet and advances in multimedia technologies have enabled new applications and services for video compression. Many of them not only require coding efficiency but also enhanced functionality and flexibility in order to adapt to varying network conditions and terminal capabilities: scalability answers these needs. Current video compression standards, often based on a hybrid DCT (Discrete Cosine Transform) predictive structure, already include some scalability features. The hybrid structures are based on a predictive scheme where each frame is temporally predicted from a given reference frame (the prediction options being a forward prediction, for the P frames, or a bi-directional prediction, for the B frames) and the prediction error thus obtained is then spatially transformed (a two-dimensional DCT transform is used in the standard schemes) to take advantage of spatial redundancies. The scalability is achieved thanks to additional enhancement layers.

Alternatively, three-dimensional (3D) subband video coding techniques generate a single, embedded bitstream with full scalability. They rely on a spatio-temporal filtering that allows a reconstruction at any desired spatial resolution or frame rate. Such an approach is for example proposed in the document “Three-dimensional subband coding of video”, C. Podilchuk et al., IEEE Transactions on Image Processing, vol. 4, No. 2, February 1995, pp. 125-139, where a group of frames (GOF) is processed as a three-dimensional (2D+t, or 3D) structure and spatio-temporally filtered in order to compact the energy in the low frequencies (further studies included Motion Compensation in this scheme in order to improve the overall coding efficiency).

The 3D subband structure obtained with such an approach is depicted in FIG. 1, where the illustrated 3D wavelet decomposition with motion compensation is applied to a group of frames (GOF). This current GOF is first motion-compensated (MC), in order to process sequences with large motion, and then temporally filtered (TF) using Haar wavelets (the dotted arrows correspond to a high-pass temporal filtering, while the other ones correspond to a low-pass temporal filtering). After the motion compensation operation and the temporal filtering operation, each temporal subband is spatially decomposed into a spatio-temporal subband, which finally leads to a 3D wavelet representation of the original GOF, three stages of decomposition being shown in the example of FIG. 1 (L and H=first stage; LL and LH=second stage; LLL and LLH=third stage). The well-known SPIHT algorithm, extended from 2D to 3D, is then chosen in order to efficiently encode the final coefficient bit-planes with respect to the spatio-temporal decomposition structure.

As it is implemented, this 3D subband structure applies the motion-compensated (MC) spatio-temporal analysis at the full original resolution at the encoder side. Spatial scalability is achieved by getting rid of the highest spatial subbands of the decomposition. However, when motion compensation is used in the 3D analysis scheme, this method does not allow a perfect reconstruction of the video sequence at lower resolution, even at very high bit-rates: this phenomenon, referred to as drift in the following description, lowers the visual quality of the scalable solution compared to a direct encoding at the targeted final display size. As explained in the document “Multiscale video compression using wavelet transform and motion compensation”, P. Y. Cheng et al., Proceedings of the International Conference on Image Processing (ICIP95), Vol. 1, 1995, pp. 606-609, said drift comes from the fact that the order of the wavelet transform and the motion compensation is not interchangeable. When a spatial scalability is enabled at the decoder side, the highest spatial subbands of the decomposition performed at the encoder side are skipped, which allows the reconstruction, or synthesis, of a low-resolution version a of the original frame A. For such a synthesis, the following operation is applied:

$$
\begin{aligned}
a &= DWT_L(L) + MC\left[DWT_L(H)\right] \\
  &= DWT_L(A) + \left[ MC\left[DWT_L(H)\right] - DWT_L\left(MC[H]\right) \right]
\end{aligned}
\qquad (1)
$$

where DWT_(L) (Discrete Wavelet Transform, in the spatial domain) denotes the resolution downsampling using the same wavelet filters as in the 3D analysis. In a perfectly scalable solution, one wants to have:

$$a = DWT_L(A) \qquad (2)$$

The remaining part of the expression (1) therefore corresponds to the drift. It can be noticed that, if no MC is applied, the drift is removed. The same happens (except at the image borders) if a single motion vector is applied to the whole frame. Yet, it is known that MC is unavoidable to achieve a good coding efficiency, and the likelihood of a single global motion is small enough to eliminate this particular case in the following paragraphs.

Some authors, such as J. W. Woods et al. in the document “A resolution and frame-rate scalable subband/wavelet video coder”, IEEE Transactions on Circuits and Systems for Video Technology, vol. 1, No. 9, September 2001, pp. 1035-1044, have already proposed technical solutions in order to get rid of this drift. However, in said document, the described scheme, in addition to being quite complex, implies the sending of extra information (the drift correction necessary to correctly synthesize the upper resolution) in the bitstream, thus wasting some bits. The solution described in the document “Multiscale video compression . . .” previously cited avoids this bottleneck but works on a predictive scheme and is not transposable to the 3D subband codec.

It has then been proposed, in the European patent application No. 02290155.7 (PHFR020002) filed on Jan. 22, 2002, a solution avoiding these drawbacks and according to which the video encoding method, used for the compression of an original video sequence divided into successive groups of frames (GOFs), comprises the steps of: (1) generating from the original video sequence, by means of a wavelet decomposition, a low resolution sequence including successive low resolution GOFs; (2) performing on said low resolution sequence a low resolution decomposition, by means of a motion compensated spatio-temporal analysis of each low resolution GOF; (3) generating from said low resolution decomposition a full resolution sequence, by means of an anchoring of the high frequency spatial subbands resulting from the wavelet decomposition to said low resolution decomposition; (4) coding said full resolution sequence and the motion vectors generated during the motion compensated spatio-temporal analysis, for generating an output coded bitstream.

Said solution, in which the global structure of the decomposition tree in the 3D subband analysis is preserved and no extra information is sent to correct the drift effect (only the decomposition/reconstruction mechanism is changed), is now recalled in a more detailed manner with reference to the coding scheme of FIG. 2 and to the motion-compensated temporal analysis at the lowest resolution, illustrated in FIG. 3.

Two main steps are provided: (a) a motion compensation step at the lowest resolution, (b) an encoding step of the high spatial subbands. First, in order to avoid drift at lower resolutions, Motion Compensation (MC) is applied at this level. Consequently the GOF at full resolution (21 in FIG. 2) is first downsized (this step is indicated by the reference d in FIG. 3, corresponding to the steps 22, 23 in FIG. 2) and the usual 3D subband MC-decomposition scheme is then applied to this downsized GOF instead of the full-size GOF, as depicted in FIG. 3 and illustrated by the step 24 in FIG. 2. In FIG. 3, the temporal subbands (L_(0,d), H_(0,d)) and (L_(1,d), H_(1,d)) are determined according to the well-known lifting scheme (H is first defined from A and B, and then L from A and H). The dotted arrows correspond to the high-pass temporal filtering, the continuous ones to the low-pass temporal filtering, and the curved ones (between low frequency spatial subbands A of the frames of the sequence, referenced A_(0,d), A_(1,d), A_(2,d), A_(3,d), or between low frequency temporal subbands L, referenced L_(0,d) and L_(1,d)) to the motion compensation. It may be noticed that a side effect of this method is the reduction of the number of motion vectors to be sent in the bitstream, which saves some bits for texture coding. Before transmitting the subbands to a tree-based entropy coder (for instance, as shown in FIG. 2, to a 3D-SPIHT encoder 27, such as described for instance in the document “Low bit-rate scalable video coding with 3D set partitioning in hierarchical trees (3D-SPIHT)”, B. J. Kim et al., IEEE Transactions on Circuits and Systems for Video Technology, vol. 10, No. 8, December 2000, pp. 1374-1387), one adds (step 25) the high spatial subbands (26, in FIG. 2) that allow the reconstruction of the full resolution. The final tree structure looks very similar to that of a 3D subband codec such as the one described in the document “A fully scalable 3D subband video codec”, IEEE Conference on Image Processing (ICIP2001), vol. 2, pp. 1017-1020, Thessaloniki, Greece, Oct. 7-10, 2001, and so a tree-based entropy coder can be applied on it without any restriction. In the encoding scheme of FIG. 2, the references are the following (for a frame of the full resolution sequence 21):

FRS: full resolution sequence 21

WD: wavelet decomposition 22

LRS: low resolution sequence 23

MC-3DSA: motion-compensated 3D subband analysis 24

LRD: low resolution decomposition (251)

HS: high subbands 26

U-HFSS: union of the three high frequency spatial subbands of a frame (252)

FR-3D-SPIHT: full resolution 3D SPIHT 27

OCB: output coded bitstream.

The corresponding decoding scheme, depicted in FIG. 4, is symmetric to this encoder (in FIG. 4, the additional references are the following:

FR-3D-SPIHT: decoding step 41

MC-3DSS: motion compensated 3D subband synthesis 43

HSS: high subbands separation 44

FRR: full resolution reconstruction 45 of the full resolution sequence).

To enable spatial scalability, the high frequency spatial subbands just have to be cut as in the usual version of the 3D subband codec, the decoding scheme of FIG. 4 showing how to naturally obtain the low resolution sequence.

Then, for coding the high spatial subbands, two main solutions are proposed, the first one without MC, and the second one with MC.

In the first solution, the high subbands simply correspond to the high frequency spatial subbands of the original (full resolution) frames of the GOF in the wavelet decomposition. Those subbands allow the reconstruction at full resolution at the decoding side. Indeed, the frames can be decoded at the low resolution. However, these frames correspond to the low spatial subband in the wavelet analysis of the original frames. Hence one has merely to put the low resolution frames and the corresponding high subbands together and apply a wavelet synthesis to obtain the full resolution frames, and thus to optimize the 3D-SPIHT encoder. In a MC scheme for a 3D subband encoder, the low temporal subbands always look like one of the original frames of the GOF. As a matter of fact:

$$L = \frac{1}{\sqrt{2}}\left[A + MC(B)\right] \qquad (3)$$

so L looks like A. Consequently, the high spatial subband of A should be placed with the low resolution decomposition corresponding to L. This approach (reordering of the high spatial subbands in the case of forward MC) is illustrated in FIG. 5, where jt indicates the temporal decomposition level (0 for the full frame rate, jt_max for the lowest frame rate), nf is the subband index at the temporal level jt, DWT_(H) denotes the high frequency wavelet filter, the coefficients c_(jt) are multiplication coefficients, and OF, LRF, TS respectively designate the original frames referenced 0 to 3, the low resolution frames referenced 00 to 03, and the transmitted subbands.

In the second solution, as using MC in every subband does not allow a reconstruction with no drift, it is also possible to partially use MC to construct the high spatial subbands and still be able to reconstruct every resolution. Instead of directly using the high frequency spatial subbands of the wavelet decomposition, a wavelet decomposition is carried out on a prediction error obtained from the MC performed on the full resolution sequence and reusing for instance the motion vectors of the low resolution.

It is then an object of the invention to improve the previously described solution by keeping its good behavior at low resolution while getting closer to the performance of a classic 3D subband codec at full resolution.

To this end, the invention relates to a video encoding method for the compression of an original video sequence divided into successive groups of frames (GOFs), said method comprising the steps of: (1) generating from the full resolution frames of the original video sequence, by means of a wavelet decomposition, a sequence of low resolution frames organized in successive low resolution GOFs; (2) performing on each low resolution GOF of said sequence of low resolution frames a motion compensated spatio-temporal analysis, leading to a low resolution sequence; (3) performing a motion compensated spatio-temporal analysis of each full resolution GOF of the original video sequence; (4) replacing at each temporal decomposition level the low-frequency subbands of said decomposition by the corresponding spatio-temporal subbands of the low resolution sequence; (5) coding the modified sequence thus obtained and the motion vectors generated during the motion compensated spatio-temporal analysis of each full resolution GOF, for generating an output coded bitstream.

The invention also relates to a video decoding method dual of the above-defined video encoding method, and to the corresponding video encoding and decoding devices.

The invention will now be described in a more detailed manner, with reference to the accompanying drawings in which:

FIG. 1 shows a 3D subband decomposition;

FIG. 2 depicts an embodiment of an encoding scheme according to a previous embodiment;

FIG. 3 illustrates a motion-compensated temporal analysis at the lowest resolution;

FIG. 4 depicts an embodiment of a decoding scheme corresponding to the encoding scheme of FIG. 2;

FIG. 5 illustrates the reordering of the high spatial subbands (for a forward motion compensation);

FIG. 6 illustrates the main steps of the encoding method according to the invention;

FIGS. 7A and 7B illustrate the corresponding motion compensated temporal filtering decomposition scheme;

FIGS. 8A and 8B illustrate at the decoding side an implementation of a synthesis scheme corresponding to the encoding method of FIG. 5.

As for the previously described solution, the present invention is now explained with reference to its basic steps: (a) motion compensation at the lowest resolution (this first step, Motion Compensation (MC), is, in fact, strictly equivalent to the one described in the case of the previous solution: one first downsizes the GOF using the spatial wavelet filters, and the usual 3D subband MC-decomposition scheme is then applied to this downsized GOF), (b) encoding the high spatial subbands.

The main difference with said previous solution lies in the second step, the principle of which is to inject at each decomposition level the temporal subbands of the low spatial resolution analysis into those of the full-resolution one. It is thus possible to reconstruct the original frames at the decoder side while performing a real temporal filtering (and not just an intra coding or a predictive difference, as in the previous solution, for the high frequency spatial subbands).

The following equations explain the mechanism in a more detailed manner. As said above, the first temporal analysis is performed at low resolution, which may be expressed by the equations (4) and (5):

$$H_d = \frac{1}{\sqrt{2}}\left[B_d - MC_{down}(A_d)\right] \qquad (4)$$

$$L_d = \sqrt{2}\cdot A_d + MC_{down}^{-1}(H_d) \qquad (5)$$

with the following notations:

A=reference frame

B=current frame

DWT=discrete wavelet transform

A_(d)=low-frequency spatial subband of the DWT of frame A, i.e. a low-spatial resolution version of frame A

B_(d)=low-frequency spatial subband of the DWT of frame B, i.e. a low-spatial resolution version of frame B

H=high-frequency temporal subband at the low spatial resolution

L=low-frequency temporal subband at the low spatial resolution

MC_(down)=motion compensation performed on low-resolution (i.e. sub-sampled) frames

MC⁻¹=inverse motion compensation (motion vectors computed to predict a frame B from a frame A are reversely used to predict the frame A from the frame B)

The equations (6) to (9) then define L_(s) and H_(s):

$$H' = B - MC_{full}(A) \qquad (6)$$

$$L' = \sqrt{2}\cdot A + MC_{full}^{-1}(H) \qquad (7)$$

$$H_s = H' \qquad (8)$$

$$L_s = \sqrt{2}\cdot L' \qquad (9)$$

with:

X_(s)=union of the three high-frequency spatial subbands of the DWT of a given frame X (with X_(s)=H_(s) or L_(s))

MC_(full)=motion compensation performed on full-resolution frames

L′ and H′=respectively the low-frequency and high-frequency temporal subbands in a conventional 3D subband scheme

$$H = DWT^{-1}\left[H_d \cup H_s\right]$$

$$L = DWT^{-1}\left[L_d \cup L_s\right]$$

Once all the low-frequency and high-frequency temporal subbands have been generated at a given temporal level jt, both at low and full spatial resolutions, the low-frequency temporal subbands L are further decomposed to achieve the next temporal level jt+1.

This is repeated at each step of the temporal decomposition, leading finally to a structure of the temporal decomposition which is very similar to that of a classic 3D subband encoder. The low frequency temporal subband of the last level and the high frequency temporal subbands of all levels are then spatially decomposed through wavelet filters and encoded to form the bitstream.
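The level-by-level recursion described above can be sketched as follows. This is only an illustrative sketch, not the codec itself: motion compensation is taken as the identity, frames are plain lists of samples, and the names `lift_pair` and `temporal_decomposition` are hypothetical. It shows that only the low-frequency temporal subbands are passed on to level jt+1, while the high-frequency subbands of every level are kept.

```python
import math

SQRT2 = math.sqrt(2.0)

def lift_pair(a, b):
    """One temporal Haar lifting step on frames a (reference) and b (current).

    Motion compensation is taken as the identity here (an assumption made
    only to keep the sketch short).
    """
    high = [(y - x) / SQRT2 for x, y in zip(a, b)]       # high-pass, as in (4)
    low = [SQRT2 * x + h for x, h in zip(a, high)]        # low-pass, as in (5)
    return low, high

def temporal_decomposition(frames, levels):
    """Return (low subbands of the last level, high subbands of each level)."""
    lows, highs_per_level = list(frames), []
    for _ in range(levels):
        next_lows, highs = [], []
        for i in range(0, len(lows), 2):
            low, high = lift_pair(lows[i], lows[i + 1])
            next_lows.append(low)
            highs.append(high)
        highs_per_level.append(highs)
        lows = next_lows          # only the low subbands go to level jt + 1
    return lows, highs_per_level

gof = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]   # four tiny frames
lows, highs = temporal_decomposition(gof, levels=2)
print(len(lows), [len(h) for h in highs])   # prints: 1 [2, 1]
```

For a GOF of four frames and two levels, the sketch keeps two high-frequency subbands at the first level, one at the second level, and a single low-frequency subband.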

The described invention keeps the good behavior of the previous solution at low resolution while getting closer to the performance of a classic 3D subband codec at full resolution (the global structure of the decomposition tree in the 3D subband analysis is preserved and no extra information is sent to correct the drift effect; only the decomposition/reconstruction mechanism is changed). The main improvement comes from the new approach to generating the high-frequency spatial subbands, which brings more coherence to the decomposition tree and therefore improves the coding efficiency of the system.

At the decoder, all the previous equations can be inverted to allow a good reconstruction. A hat (^) is simply added to every subband in order to indicate that decoding is now concerned and that some information might have been lost. First, a classic 3D subband synthesis at low resolution recovers the low spatial resolution subbands A_(d) and B_(d) from L_(d) and H_(d):

$$\hat{A}_d = \frac{1}{\sqrt{2}}\left[\hat{L}_d - MC_{down}^{-1}\left(\hat{H}_d\right)\right] \qquad (10)$$

$$\hat{B}_d = MC_{down}\left(\hat{A}_d\right) + \sqrt{2}\cdot\hat{H}_d \qquad (11)$$
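The low-resolution lifting pair (4)-(5) and its inverse (10)-(11) can be checked with a short sketch. It assumes, purely for illustration, one-dimensional frames and a motion compensation reduced to a circular shift by a single vector `d`; the function names are hypothetical. The point it illustrates is that the synthesis undoes the analysis whatever the motion vector is, because each lifting step is subtracted back exactly as it was added.

```python
import math

SQRT2 = math.sqrt(2.0)

def mc(frame, d):
    """Toy motion compensation: circular shift by d samples (one global vector)."""
    n = len(frame)
    return [frame[(i - d) % n] for i in range(n)]

def mc_inv(frame, d):
    """Inverse motion compensation: the same vector used in reverse."""
    return mc(frame, -d)

def analyse(a_d, b_d, d):
    """Low-resolution temporal analysis, equations (4) and (5)."""
    h_d = [(b - p) / SQRT2 for b, p in zip(b_d, mc(a_d, d))]        # (4)
    l_d = [SQRT2 * a + q for a, q in zip(a_d, mc_inv(h_d, d))]      # (5)
    return l_d, h_d

def synthesise(l_d, h_d, d):
    """Low-resolution temporal synthesis, equations (10) and (11)."""
    a_d = [(l - q) / SQRT2 for l, q in zip(l_d, mc_inv(h_d, d))]    # (10)
    b_d = [p + SQRT2 * h for p, h in zip(mc(a_d, d), h_d)]          # (11)
    return a_d, b_d

a_d = [1.0, 4.0, 2.0, 8.0]
b_d = [5.0, 1.0, 4.0, 2.0]
l_d, h_d = analyse(a_d, b_d, d=1)
a_rec, b_rec = synthesise(l_d, h_d, d=1)
```

Up to floating-point rounding, `a_rec` and `b_rec` equal `a_d` and `b_d`.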

It is also easy to get A_(s) by synthesizing H and by inverting the equation (7). The process is explained by the equations (12) to (15):

$$\hat{H} = DWT^{-1}\left[\hat{H}_d \cup \hat{H}_s\right] \qquad (12)$$

$$\hat{L} = DWT^{-1}\left[\hat{L}_d \cup \hat{L}_s\right] \qquad (13)$$

$$\hat{A}_s^{n} = \frac{1}{\sqrt{2}}\left[\hat{L} - MC_{full}^{-1}\left(\hat{H}\right)\right] \qquad (14)$$

$$\hat{A}_s = \hat{A}_s^{n} \qquad (15)$$

Then Â is simply reconstructed from Â_(d) and Â_(s). Consequently one can get B_(s) and finally synthesize B. This is summarized by the system of equations (16) to (19):

$$\hat{A} = DWT^{-1}\left[\hat{A}_d \cup \hat{A}_s\right] \qquad (16)$$

$$\hat{B}_s^{n} = MC_{full}\left(\hat{A}\right) + \hat{H} \qquad (17)$$

$$\hat{B}_s = \hat{B}_s^{n} \qquad (18)$$

$$\hat{B} = DWT^{-1}\left[\hat{B}_d \cup \hat{B}_s\right] \qquad (19)$$

These operations are repeated until the very first temporal level, i.e. until the GOF is fully decoded. It can clearly be seen that this scheme generates no drift since perfect reconstruction is achieved as soon as L and H are completely transmitted in the bit-stream (it can also be noted that the full spatial resolution synthesis is now intimately linked with the low resolution one at each temporal level, which was not the case in the previous solution).
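The analysis and synthesis equations can be exercised together on a toy signal. The sketch below rests on several assumptions that are not taken from the description: frames are one-dimensional, the spatial DWT is a one-level orthonormal Haar transform, motion compensation is a circular shift, and H_(s) and L_(s) are read as the high-frequency spatial bands of H′ and L′ without additional scaling (one possible reading of equations (8) and (9)). All function names are hypothetical.

```python
import math

SQRT2 = math.sqrt(2.0)

def dwt(x):
    """One-level orthonormal Haar DWT: returns (low band, high band)."""
    pairs = [(x[i], x[i + 1]) for i in range(0, len(x), 2)]
    return ([(p + q) / SQRT2 for p, q in pairs],
            [(p - q) / SQRT2 for p, q in pairs])

def idwt(low, high):
    """Inverse of dwt: interleaves the reconstructed sample pairs."""
    out = []
    for lo, hi in zip(low, high):
        out += [(lo + hi) / SQRT2, (lo - hi) / SQRT2]
    return out

def mc(x, d):
    """Toy motion compensation: circular shift by d; mc(x, -d) inverts it."""
    return [x[(i - d) % len(x)] for i in range(len(x))]

def comb(u, v, cu=1.0, cv=1.0):
    """Element-wise cu*u + cv*v."""
    return [cu * p + cv * q for p, q in zip(u, v)]

def encode(a, b, d_full, d_down):
    a_d, b_d = dwt(a)[0], dwt(b)[0]
    h_d = comb(b_d, mc(a_d, d_down), 1 / SQRT2, -1 / SQRT2)    # (4)
    l_d = comb(a_d, mc(h_d, -d_down), SQRT2, 1.0)              # (5)
    h_s = dwt(comb(b, mc(a, d_full), 1.0, -1.0))[1]            # (6), (8)
    h = idwt(h_d, h_s)                                         # injected H
    l_s = dwt(comb(a, mc(h, -d_full), SQRT2, 1.0))[1]          # (7), (9) as read here
    return l_d, l_s, h_d, h_s

def decode_low(l_d, h_d, d_down):
    a_d = comb(l_d, mc(h_d, -d_down), 1 / SQRT2, -1 / SQRT2)   # (10)
    b_d = comb(mc(a_d, d_down), h_d, 1.0, SQRT2)               # (11)
    return a_d, b_d

def decode_full(l_d, l_s, h_d, h_s, d_full, d_down):
    a_d, b_d = decode_low(l_d, h_d, d_down)
    h, low = idwt(h_d, h_s), idwt(l_d, l_s)                    # (12), (13)
    a_n = comb(low, mc(h, -d_full), 1 / SQRT2, -1 / SQRT2)     # (14)
    a = idwt(a_d, dwt(a_n)[1])                                 # (15), (16)
    b_s = dwt(comb(mc(a, d_full), h))[1]                       # (17), (18)
    return a, idwt(b_d, b_s)                                   # (19)

frame_a = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0, 6.0]
frame_b = [2.0, 7.0, 1.0, 8.0, 2.0, 8.0, 1.0, 8.0]
coded = encode(frame_a, frame_b, d_full=3, d_down=1)
rec_a, rec_b = decode_full(*coded, d_full=3, d_down=1)
low_a, low_b = decode_low(coded[0], coded[2], d_down=1)
```

Under these assumptions, `rec_a` and `rec_b` match the input frames, and `low_a` and `low_b` match the low spatial bands of the input frames while using only L_(d) and H_(d), which is the drift-free low-resolution behavior expressed by equation (2).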

The encoding principle defined above is now described in a more detailed manner, with reference to FIG. 6, which illustrates the main steps of the encoding method, and FIG. 7 (comprising in fact two Figures: FIG. 7A and FIG. 7B), which illustrates in a more detailed manner the corresponding motion compensated temporal filtering scheme.

In the encoding scheme of FIG. 6, the original group of frames GOF (this current GOF comprises full resolution frames FRF) is first used for generating, by means of a wavelet decomposition WD, low resolution frames LRF on which a motion compensated spatio-temporal analysis MCSTA is then performed. A low resolution sequence is thus obtained. The original full resolution frames (i.e. each full resolution GOF) are also used for performing a motion compensated spatio-temporal analysis (the corresponding successive steps MCSTA and WD correspond to a “MC-temporal analysis” and a “wavelet decomposition”) for generating high spatial subbands HSS.

After these two parallel sets of steps performed on the full resolution frames, the low frequency subbands of the decomposition thus obtained are iteratively replaced, at each temporal decomposition level, by the corresponding spatio-temporal subbands of the low resolution sequence LRS, according to the following operations: (a) first, a storing operation 62, for storing the high frequency spatio-temporal subbands of the decomposition in view of the final encoding step 69; (b) then a wavelet synthesis 63, performed from the low frequency spatio-temporal subbands of said decomposition (a test 61 “L or H temporal subband” has made it possible to separate said low frequency and high frequency spatio-temporal subbands); (c) then a test 64 concerning the rank of the temporal decomposition level, for storing (65) the low frequency spatio-temporal subbands of the decomposition if said level is the last one, the two parallel sets of steps being on the contrary further carried out for the next temporal level (66) if said level is not the last one.

More detailed representations of the whole decomposition scheme (at the encoding side) and the corresponding motion-compensated synthesis scheme (at the decoding side) can be seen in FIG. 7 and FIG. 8 (also comprising two Figures: FIG. 8A and FIG. 8B) respectively. This example of a spatio-temporal decomposition according to the invention is related to a GOF of only four frames A0 to A3 (for the sake of simplicity), with a forward motion compensation and two decomposition levels. The high and low frequency (H′₀, H′₁ and L′₀, L′₁ respectively) temporal subbands are computed from the original frames by using the so-called lifting scheme, described for instance in the document “Factoring wavelet transforms into lifting steps”, I. Daubechies and W. Sweldens, Bell Laboratories technical report, Lucent Technologies, 1996. The notations DWT and DWT⁻¹ respectively designate the wavelet decomposition and the wavelet synthesis. The right side of FIG. 7 illustrates successively the first spatio-temporal decomposition level, the inverse synthesis applied to the low frequency spatio-temporal subbands of the decomposition and the second spatio-temporal decomposition level (performed after the replacement of the low frequency subbands of the decomposition by the corresponding spatio-temporal subbands of the low resolution sequence, said replacement being indicated by the arrows coming from the left side of FIG. 7).

The video encoding method and device according to the invention have been described above in a detailed manner, but it is clear that the invention also relates to a corresponding video decoding method, that comprises successive steps dual of the steps performed when implementing said video encoding method, and to a corresponding video decoding device, that comprises successive means dual of the means provided in said video encoding device.

1. A video encoding method for the compression of an original video sequence divided into successive groups of frames (GOFs), said method comprising the steps of: (1) generating from the full resolution frames of the original video sequence, by means of a wavelet decomposition, a sequence of low resolution frames organized in successive low resolution GOFs; (2) performing on each low resolution GOF of said sequence of low resolution frames a motion compensated spatio-temporal analysis, leading to a low resolution sequence; (3) performing a motion compensated spatio-temporal analysis of each full resolution GOF of the original video sequence; (4) replacing at each temporal decomposition level the low-frequency subbands of said decomposition by the corresponding spatio-temporal subbands of the low resolution sequence; (5) coding the modified sequence thus obtained and the motion vectors generated during the motion compensated spatio-temporal analysis of each full resolution GOF, for generating an output coded bitstream.
2. A video encoding device for the compression of an original video sequence divided into successive groups of frames (GOFs), said device comprising: (1) means for generating from the full resolution frames of the original video sequence and by means of a wavelet decomposition, a sequence of low resolution frames organized in successive low resolution GOFs; (2) means for performing on each low resolution GOF of said sequence of low resolution frames a motion compensated spatio-temporal analysis, leading to a low resolution sequence; (3) means for performing a motion compensated spatio-temporal analysis of each full resolution GOF of the original video sequence; (4) means for replacing at each temporal decomposition level the low-frequency subbands of said decomposition by the corresponding spatio-temporal subbands of the low resolution sequence; (5) means for coding the modified sequence thus obtained and the motion vectors generated during the motion compensated spatio-temporal analysis of each full resolution GOF, for generating an output coded bitstream.
3. A video decoding method, provided for decoding a coded bitstream corresponding to a video sequence coded by means of a video encoding method comprising, for the compression of said original video sequence, the steps of: (1) generating from the full resolution frames of the original video sequence, by means of a wavelet decomposition, a sequence of low resolution frames organized in successive low resolution GOFs; (2) performing on each low resolution GOF of said sequence of low resolution frames a motion compensated spatio-temporal analysis, leading to a low resolution sequence; (3) performing a motion compensated spatio-temporal analysis of each full resolution GOF of the original video sequence; (4) replacing at each temporal decomposition level the low-frequency subbands of said decomposition by the corresponding spatio-temporal subbands of the low resolution sequence; (5) coding the modified sequence thus obtained and the motion vectors generated during the motion compensated spatio-temporal analysis of each full resolution GOF, for generating an output coded bitstream; said video decoding method comprising successive steps that are dual of the steps performed according to the video encoding method of claim 1.
4. A video decoding device, provided for decoding a coded bitstream corresponding to a video sequence coded by means of a video encoding device comprising, for the compression of said original video sequence: (1) means for generating from the full resolution frames of the original video sequence and by means of a wavelet decomposition, a sequence of low resolution frames organized in successive low resolution GOFs; (2) means for performing on each low resolution GOF of said sequence of low resolution frames a motion compensated spatio-temporal analysis, leading to a low resolution sequence; (3) means for performing a motion compensated spatio-temporal analysis of each full resolution GOF of the original video sequence; (4) means for replacing at each temporal decomposition level the low-frequency subbands of said decomposition by the corresponding spatio-temporal subbands of the low resolution sequence; (5) means for coding the modified sequence thus obtained and the motion vectors generated during the motion compensated spatio-temporal analysis of each full resolution GOF, for generating an output coded bitstream; said video decoding device comprising successive means that are dual of the means provided in the video encoding device according to claim 2.